Section 1: Introduction


The importance of open and timely communication by trusted scholarly publishers has arguably 
never been greater. The COVID-19 pandemic illustrated the value of rapid and open dissemination 
of research findings (Tavernier, 2020), and the rise in open access publishing means that restrictions 
to accessing research are increasingly being removed. At the same time, new technologies such as 
artificial intelligence (AI) hold the promise to fuel a rapid increase in scientific discovery (Economist,
2023), remove language barriers to writing papers, and accelerate the use of the scholarly record to 
develop new hypotheses (Extance, 2018). 


Yet alongside this promise, the scholarly publishing industry faces a deepening crisis - one where 
some of the drivers of progress also pose an increasing threat (Liverpool, 2023). We are in a time 
when researchers face increasing external pressures to publish for career advancement while 
fabricating a paper using generative AI has never been easier. These factors make journals a target
for the continued and systematic manipulation of the publication process via paper mills and other 
forms of problematic activity. The last few years have seen increasing examples of publishers facing 
these challenges at an unprecedented scale (Else and van Noorden, 2021; van Noorden, 2023).


Although far from being the only publisher affected, the last 18 months have seen Hindawi at 
the eye of this storm of publication manipulation at scale, and we are on a continued journey 
through it. Hindawi and Wiley have together learned many lessons about how bad actors infiltrate
publishing systems, and we have taken new approaches to clean up the scholarly record at scale
and protect against further attacks. We believe that sharing our experience with others across the 
industry plays an important part in developing a collaborative approach to tackling systematic 
manipulation at scale. 


In this article, we describe what paper mills are and how we believe they operate, as well as other 
— often related — unethical activities that are distorting the scholarly landscape and endangering 
public trust in research. These activities include unethical authorship practices, the use of AI in
manuscript fabrication and image manipulation, and peer review manipulation. We explore the 
significant impacts and disruptions resulting from these factors and the steps Hindawi has taken,
and is continuing to take, to investigate and clean up the scholarly record through “designing a
new retraction process that will help us, and potentially others, accelerate and deal with this new 
era of mass retractions fairly” (Flynn, 2023).


We conclude with a call to all stakeholders within the scholarly publishing ecosystem, with 
recommendations to work together and introduce a collaborative approach to journal security 
and research integrity. By sharing our approach to the challenges that have led to such widescale 
and deliberate undermining of trust in research, the hope is that others can build on and refine the 
process, helping to further ensure publishing systems are less vulnerable to manipulation in future. 


Section 2: An industry-wide problem


There is a “continuation of unethical activity” in the publishing world that represents a “major and 
ongoing challenge for the entire scholarly publishing industry and all who rely on the integrity
of the scholarly record” (Ferguson, 2022). Such activity could range from inappropriate research
and publishing practices due to lack of training through “misconduct, fraud, or even criminality” 
(Kolstoe, 2023; UKRIO, 2023). Whether it is due to researchers feeling the pressure to publish 
(Oransky and Marcus, 2023) or through “fake paper factories” operating at scale (Else and van 
Noorden, 2021; van Noorden, 2023), publishing processes have been compromised and publication
misconduct is widespread. 


Unethical authorship practices 


The academic culture of ‘publish or perish’ has incentivized unethical behaviour as researchers 
feel intense pressure to produce and publish high volumes of articles to advance in their careers 
(Camacho, 2021). In recent years, there has been a concerning surge in authorship misconduct, a 
trend that poses a serious threat to the integrity of scholarly work. Unethical practices that fall into 
this category include ghost-writing, guest authorship, and authorship for sale; while not new, they 
are becoming more prevalent. 


Some researchers are publishing an astonishing number of papers, with certain authors putting 
their name to hundreds of articles within a single year. This includes papers that seemingly are 
not within their usual field of expertise and alongside unrelated co-authors (Ansede, 2023). On 
the flip side, authorship positions are also being offered for sale. Although it is difficult to identify 
the specific companies concerned, advertisements selling co-authorship have been found (e.g.,
Abalkina, 2021; Kincaid, 2023).


These deceptive tactics not only compromise the credibility of academic publications but 
also undermine the principles of transparency and accountability that make scholarly research 
trustworthy and which are crucial to innovation and advancing knowledge. 


The use of AI in manuscript fabrication and image manipulation


Artificial intelligence will be a transformative tool for good in scholarly research that will help fuel 
innovation, but it is also a powerful instrument for unethical activity in the wrong hands, giving rise 
to the creation of deceptive ‘original’ research. AI tools can be exploited to generate fabricated data,
manipulate images, and even devise entire studies with seemingly legitimate results (Naddaf, 2023). 
This poses a significant threat to the credibility of scientific inquiry, as researchers and publishers 
may struggle to discern between authentic and manipulated content. 


AI “brings challenges to research integrity and accuracy, including the potential increase in
plagiarism, image manipulation, and paper mills” (Zhou, 2023), through its ability to quickly fabricate
research and evade detection by standard checks. Detecting synthetic output from large language 
models is a society-wide challenge, and as of today there are no reliable methods to do so. With 
increasingly sophisticated language models such as ChatGPT, plagiarism is likely to become 
undetectable and AI-human writing will become a normal part of the creative process (Eaton,
2023). Humans, however, will remain responsible for the trustworthiness of scholarly outputs. The 
challenge in such a ‘post-plagiarism world’ is how an ethical framework can be created and taught 
that ensures appropriate attribution and holds researchers, publishers, funders, institutions, and 
others to account for the integrity of scholarly work (Eaton, 2023). 


Paper mills 


Paper mills have been most recently defined by the Committee on Publication Ethics (COPE) as: 
“An individual, group of individuals, or organisation that aims to manipulate the publication process 
to achieve the publication of articles for the purposes of financial gain” (COPE Council, 2023a). 
Awareness of firms selling papers to researchers or arranging to bypass peer review goes back to at 
least 2010 (Else and van Noorden, 2021). In the past two years, however, there has been a dramatic
increase in systematic manipulation, and paper mills are now a serious “sort of industrial scale fraud”
(Nunes and Bishop, 2023). 


Paper mills contaminate the scholarly literature with fabricated or nonsensical results. If these 
papers are discovered by the publisher or the academic community, they require retraction by the 
publisher in order to correct the publication record. The discovery that a paper mill has infiltrated a
journal can have severe consequences, including the loss of credibility and trust among researchers 
and removal from scholarly indexes. 


Journals soliciting proposals for Special Issues (SIs) can present particularly attractive targets for 
paper mills, where ‘Guest Editors’ are recruited to take a temporary editorial role in a journal. Guest 
Editors can submit proposals for SIs, help solicit the papers themselves, and oversee and contribute
to peer review. Strategies of paper mills include manipulating the identities of Guest Editors, 
authors, and even reviewers (to appear as genuine researchers) and/or fabricating content (to 
appear as legitimate papers). 


The infiltration of an SI within a journal by a paper mill often involves some or all of the following
steps: 


1. Create, manipulate, or hijack the identity of a researcher who will act as the SI Guest Editor, or 
pay a genuine researcher who is open to participation. The latter can include researchers who 
do not have expertise in the field of the SI, relying on their valid credentials and academic 
affiliation to mask the misrepresentation. 

2. Create a proposal for an SI that is related to the scope of the target journal. Proposals that are
broad in scope are often put forward, which enables paper mill manuscripts across
wide-ranging topics to be submitted.

3. Create papers for the SI that will look legitimate to non-experts. 

4. Plant inappropriate or unnecessary citations in the text and reference list to boost
self-citations or citations to other researchers.

5. Create, manipulate, or hijack the identities of authors and peer reviewers; or influence genuine 
researchers to participate. 

6. Submit the fabricated papers authored by the fake authors to the SI, or solicit genuine papers
from authors who are willing to pay for an ‘expedited’ route to publication. 

7. Use the fake or compromised peer reviewers to provide comments on the papers that appear 
legitimate but, like many of the papers, are revealed to be duplicates, unrelated, or otherwise 
unreliable. 

8. While this is going on, potentially also sell co-authorships to customers. 

9. Accept the papers via the fake or compromised Guest Editor and collect payment from the
co-authors.


As indicated above, we note that the deliberate inclusion of inappropriate citations is a frequent 
feature of paper mills (e.g. Tomentella, 2023). In relation to this, the phenomenon of ‘citation rings’ — 
where groups of researchers act together to boost the citation of each other's papers (Bik, 2022) — is 
another form of publication manipulation at scale, although not discussed further here. 


Peer review manipulation 


Peer review, a cornerstone of scholarly publishing, is designed to ensure the quality and credibility 
of academic research. However, “the peer review process is showing signs of strain. A growing 
shortage of skilled reviewers, organised fraud in the form of peer review rings, fake papers and 
manipulated results, and new challenges brought on by new AI tools and large language models,
all pose threats to the integrity of the process” (COPE Forum, 2023). 


Peer review manipulation and the formation of peer review rings are on the rise. In instances of
manipulation, dishonest researchers may engage in various deceptive practices, such as submitting 
fake contact information for suggested reviewers, providing false email addresses, or even 
reviewing their own submissions under different identities (Lechner and Evans, 2020; COPE Council, 
2021). 


The formation of peer review rings exacerbates the challenge of upholding unbiased peer 
review by creating networks of individuals willing to provide favorable reviews for each other's 
submissions. These collusive efforts undermine the objectivity and impartiality that peer review 
is intended to uphold. Peer review rings often operate covertly, with members strategically 
submitting and reviewing each other's work to ensure acceptance without genuine scrutiny 
(Ferguson et al., 2014).


Many publishers use multiple third-party platforms to help provide reviewer recommendations and
to verify reviewer (and editor) identity. As part of our investigations at Hindawi, we have uncovered
instances where those involved in peer review rings and paper mills were deeply embedded
in various industry databases and the infrastructure used for selection and verification of peer
reviewers.


Section 3: What happened at Hindawi 


SIs can benefit journals by allowing them to promote new emerging fields or cross-disciplinary
topics. They also provide researchers from many different disciplines and regions around the world 
the opportunity to publish the research that is important to them. 


Until October 2022, Hindawi's SIs program largely involved attracting proposals from researchers
volunteering to act as Guest Editors for SIs. Our program utilized a range of checks to assess the
expertise of potential Guest Editors and to help verify their identity. These included both manual 
checks and, like most large academic publishers, the use of third-party tools and systems. Although 
several downstream checks were in place at multiple steps along the workflow to publication,
this process relied to some extent on the inherent trust that has been built up over decades in
the publishing industry, which assumes that academics act ethically, rigorously, and impartially as
editors and reviewers, and in line with Hindawi’s guidelines for researchers taking on these roles. 


Despite the checks that were in place, suspicious patterns were identified by Hindawi research 
integrity staff. An initial investigation, prompted by an increasing number of escalations to that 
team coupled with new analytic capabilities afforded by the in-house peer review system, 
provided evidence that suspicious activity was identifiable across multiple Hindawi SIs, and the
findings were escalated to the company’s senior management. Around this time, the scientific 
community, especially independent researchers with an interest in research integrity, also began 
noticing indicators of large-scale systematic manipulation. 


Hindawi research integrity staff undertook a more detailed investigation in September 2022, with 
an initial retraction of 511 articles announced in October 2022 (Kincaid, 2022). Prior to this, Hindawi 
and Wiley raised the alarm to the publishing industry that we had detected paper mill activity at 
scale, including alerting other publishers and relevant industry stakeholders of the presence of bad 
actors in their portfolios and systems.


These initial investigations coupled with the findings of independent researchers pointed to
the infiltration of Hindawi SIs at an even greater scale than first anticipated, making it clear that
thousands of manuscripts would need to be investigated. As a result, October 14, 2022 saw Hindawi
pause the publication of all SI content. We knew we had to both secure our existing workflows and 
clean up the papers we had already published that had been compromised. We therefore made 
two key decisions.


The first decision was that every SI manuscript yet to be published would be reassessed to prevent 
the publication of further compromised papers. Suitably qualified editorial staff, and suitably 
qualified vendors overseen by our own staff, subsequently assessed all SI manuscripts in their 
entirety, including the use of a comprehensive checklist of specific hallmarks of paper mill papers.
At the same time, we carried out a thorough review and strengthening of our existing checks on 
editors, authors and reviewers. 


Pausing the SIs program was essential to stop the publication of additional compromised content
and to ensure that we could have confidence in the quality of future output. This decision had a 
significant financial impact on the business, which is a matter of public record (Butcher, 2023). 


The second major decision was to investigate the thousands of papers published in Hindawi 
SIs. The scale of the task necessitated the development of a new protocol aimed at detecting
manipulation patterns and retracting papers rapidly and at scale. We go on to describe this 
approach, which required a significant investment in the resources required to support such a 
large-scale investigation process, in more detail below. 


The issues identified within Hindawi SIs during our investigations encompass multiple signs of
paper mill activity, including problems and patterns relating to the papers themselves, the citations 
within them, and the individuals involved with the peer review process. 


The consequences 


The consequences of paper mill papers infiltrating Hindawi's systems and reaching publication
were severe. At the highest level is the issue of contamination of the scholarly literature and the 
undermining of trust in journals, publishers, and research itself. Also of crucial concern is the 
damaging effect on innocent authors who had published genuine papers in the journals affected 
— journals whose reputations had now been tarnished. 


We also recognized that many of those who had perpetrated the abuse within our portfolio
had established publication records and profiles elsewhere before engaging with our program.
Consistent with the ethos that led us to share our early findings with other publishers, we felt a 
deep responsibility for ensuring that the abuses that had proliferated within our portfolio could be 
tracked, identified, and made useful to others. 


As a result, we undertook the complex, resource-intensive process of investigating systematic 
manipulation at scale, working through the incredibly detailed and involved process of retracting 
thousands of papers, while balancing fairness to authors with the urgent need to clean up the 
scholarly record in a timely manner. 


Retractions at scale: a new approach


As outlined above, having identified that the Hindawi SI program had been affected at scale, 
Hindawi implemented a new approach to investigation and retraction necessitated by the 
magnitude of the problem and the need to correct the scientific record as quickly as possible. 


Typically, when there is suspected misconduct connected with a single paper or group of papers, 
the editors of the journal, in consultation with a research integrity team and publisher, will take joint 
responsibility to investigate and retract a paper. In dealing with manipulation at scale, however, 
patterns of manipulation must be analyzed across different SIs and journals (Bishop, 2023). The
publisher alone has the resource and access to relevant data to enable an investigation at scale 
across multiple articles and journals simultaneously, and to act on what we discover across the 
portfolio. 


From January to March of this year, therefore, we developed a new protocol for research integrity 
investigations at scale, aiming to detect patterns of manipulation across multiple articles 
simultaneously. Key to the approach was bringing together a dedicated, multifunctional team 
including experts in research integrity, publishing operations, and data analysis. 


It was also crucial to take an evidence-informed approach and to adhere as closely as possible to 
the principles that COPE uses to guide investigations of authors and potential retractions. Hindawi 
has been open with COPE about the approach, and COPE themselves have recently released 
specific guidelines to deal with paper mills (COPE Council, 2023b). 


An evidence-informed approach 


The evidence we have been using to retract a paper in this new at-scale approach includes one or 
more of the following indicators of manipulation: 


1. Discrepancies in scope.
2. Discrepancies in the description of the research reported.
3. Discrepancies between the availability of data and the research described.
4. Inappropriate citations.
5. Incoherent, meaningless, and/or irrelevant content included in the article.
6. Compromised or manipulated peer-review.


The decision to retract, given this evidence, is based on the rationale that the publication process 
has been undermined and we can no longer vouch for the integrity of the article. We have 
intentionally limited the specific details of what is under investigation in the retraction notice, in 
part because it is essential that the intelligence shared with bad actors is restricted. Moreover, to
proceed expeditiously and communicate with thousands of authors, standardized wording of
retraction notices was essential. 


The investigation workflow combined checks performed by suitably qualified and trained 
individuals alongside an analysis of data from internal systems and data shared publicly by 
independent research integrity experts. To proceed at a reasonable pace, we allocated work
to PhD-qualified staff at vendors. Each paper was assessed by a minimum of two people, and
assessed a third time if arbitration was needed. Internal staff with expertise trained the vendors, 
and checked for consistency and accuracy, doing additional investigation where needed. 
Responses were collected via a software application we developed specifically for this purpose. In 
parallel, computational tools were used to provide additional supporting evidence about fabricated 
content, plagiarized peer-review reports, and suspect peer review turn-around times. We also built 
on the valuable work done by independent research integrity sleuths, by collecting, evaluating, and 
categorizing comments provided on PubPeer as part of Smut Clyde's list. 
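
As an illustration only, the double-assessment rule and one of the computational checks described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the tooling Hindawi built: the function names and the 24-hour plausibility floor are hypothetical and do not come from the investigation.

```python
from datetime import datetime, timedelta
from typing import Callable


def resolve_assessment(first: bool, second: bool,
                       arbitrate: Callable[[], bool]) -> bool:
    """Return the shared verdict of two assessors.

    The third assessment (arbitration) is only requested when the
    first two assessors disagree.
    """
    if first == second:
        return first
    return arbitrate()


def is_suspect_turnaround(invited: datetime, submitted: datetime,
                          min_plausible: timedelta = timedelta(hours=24)) -> bool:
    """Flag a review returned faster than a plausibility floor.

    The 24-hour default is a hypothetical value chosen for illustration.
    """
    return submitted - invited < min_plausible
```

Passing the arbitrator as a callable means the third assessor is only consulted on disagreement, which mirrors the "assessed a third time if arbitration was needed" step.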


The bespoke software application guided investigators through the questionnaire and enabled
the same paper to be reviewed by different teams to help ensure consistency. All the variables
contributed to an overall score for the paper, which was used by Hindawi to make the final decision 
about whether the paper should be retracted. Thresholds were assigned to the scores and any
article that reached these thresholds was retracted.
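
The scoring step can be pictured with a short sketch. The indicator names below paraphrase the six indicators of manipulation listed earlier; the equal weights and the threshold value are assumptions made purely for illustration, since the actual variables, weights, and thresholds are not described here.

```python
from typing import Mapping

# Hypothetical equal weights for the six indicators of manipulation.
INDICATOR_WEIGHTS = {
    "scope_discrepancy": 1.0,
    "description_discrepancy": 1.0,
    "data_availability_discrepancy": 1.0,
    "inappropriate_citations": 1.0,
    "incoherent_content": 1.0,
    "compromised_peer_review": 1.0,
}

# Hypothetical threshold; the real thresholds are not published.
RETRACTION_THRESHOLD = 3.0


def paper_score(flags: Mapping[str, bool]) -> float:
    """Sum the weights of every indicator flagged as present."""
    unknown = set(flags) - set(INDICATOR_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown indicators: {sorted(unknown)}")
    return sum(INDICATOR_WEIGHTS[name] for name, present in flags.items() if present)


def meets_retraction_threshold(flags: Mapping[str, bool],
                               threshold: float = RETRACTION_THRESHOLD) -> bool:
    """Report whether the overall score reaches the threshold."""
    return paper_score(flags) >= threshold
```

In this sketch the function only reports whether a score reaches the threshold; the final decision on retraction remains with the publisher, as described above.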


As we gathered the evidence, we have also been able to systematically pinpoint particular bad
actors: Guest Editors who have been responsible for handling multiple retracted papers. We have
since sent sanction letters to several hundred Guest Editors notifying them that they can no longer
take on an editorial role in our journals or publish with us in future.


What are we doing now?


To date, thousands of papers have been retracted from Hindawi SIs as a result of our large-scale
investigation and we continue to investigate and issue retractions where appropriate. In some
cases, we suspect that additional papers in particular SIs are likely to be problematic. Where these
papers have not yet been investigated, or where we do not have specific indications of problems
with individual papers, we plan to issue Expressions of Concern at the level of whole SIs to alert
readers that they should take additional care interpreting all papers in the SI, whether retracted or
not.


Peer review processes — both at Hindawi and more widely across Wiley — are being continually 
strengthened through increased controls and expert staffing, new workflows, and AI-based tools
that are being developed and refined to significantly improve manuscript screening. Much more 
stringent checks on all SI proposals and Guest Editors have been introduced to provide a clearer 
understanding of Editor identity, research experience, publication history, and indicators of previous 
unethical publishing activity before allowing those researchers to lead SIs going forwards. Informed
by the learnings of our investigations, new checks have been introduced at multiple steps in the 
publication process, including much greater scrutiny of peer review. 


Section 4: Conclusion and call to action


The need for a collaborative approach


Painful lessons have been learned over the past year, by Hindawi and several other publishers. We 
are committed to acting in the interests of all in countering the increasing threat and sophistication 
of those who undermine the integrity of published research for their own gain. 


It is vital that publishers and other stakeholders (including funders, institutions, and indexers) 
continue to come together to share intelligence on the activities of paper mills and other bad
actors, update each other on their approaches to investigation and retraction, and share new 
processes to combat publication misconduct at scale. For example, the STM Integrity Hub is 
enabling this collaboration by acting as a “knowledge exchange: where publishers may share 
experiences and learnings regarding how best to safeguard research integrity in science; a think 
tank for policy and legal frameworks; and a living library of infrastructure and tools” (STM, 2023). 
The infrastructure that they have built, and the knowledge-sharing communities established to 
support prioritization, are essential components of translating that collaboration into practice. 


There is important work underway led by the United2Act initiative, initially conceived through a
forum set up by COPE and the STM Association, bringing stakeholders together to establish areas 
of focus and prioritization. We will be actively working with this diverse group and developing a 
shared strategy to mitigate the impact of systematic manipulation. It is critically important that 
we continue to speak out and share what we have learned as a community, whether in our roles 
as journal editors, publishers, or researchers with an interest in research integrity. COPE provided 
an excellent forum for this community participation during its Publication Integrity Week held in 
October of this year. 


Standardization across research publishing is also essential to establishing a cohesive approach to 
upholding research integrity. The work that the National Information Standards Organization (NISO) 
is leading to establish best practices around post-publication amendments, their publication, and 
consistent metadata has been available for public consultation since October (NISO, 2023). 


One consequence of what we have learned from the experience of Hindawi is a growing 
awareness of the vulnerability to manipulation not just of our own workflows but also of the
third-party services we rely on and the downstream platforms that provide services and metrics 
for scholarly communication more widely. For example, we, like many other publishers, rely on 
trusted third-party providers to help find recommendations for reviewers. Such platforms often
rank reviewer performance and reliability based on the number of reviews an individual has done. 
If these platforms are infiltrated with bad actors using false or multiple identities, they can in turn 
infiltrate the reviewer databases of many publishers simultaneously. In addition, downstream 
services that provide information about author identity or that rely on citations and mining the 
literature to evaluate researchers will also be impacted. To date, there is little published evidence 
of the impact of wide-scale manipulation on the scholarly infrastructure, although a recent
analysis of ‘sneaked references’ (Lonni, 2023; Chawla, 2023) and the impact of ‘hijacked’ journals on 
bibliographic databases points to the insidious damage this can cause (Abalkina, 2023). Given the 
scale of manipulation that is emerging across multiple publishers (van Noorden, 2023), the knock- 
on effects are therefore likely to impact both researcher evaluation and knowledge discovery much 
more widely. There is a pressing need to take collective responsibility and collective action. 


In summing up, we hope that our message is clear: publishers need to work together, and with
all stakeholders in the scholarly publishing ecosystem, and devote substantial resources to ensure
the integrity of our journals and the content we publish. From the experience of Hindawi, we have
learned a great deal and offer below our recommendations for other publishers and stakeholders:


We need to support the spirit of a trust-based research publishing ecosystem in the form 
of a ‘trust-but-verify’ system, in which participants have independently verified and secure 
identities that link accounts to real individuals. 


Publishers should continue — and accelerate — the responsible sharing of data and techniques, 
analogous to the successful work done within the financial industry to combat fraud. Such 
sharing — both between publishers and with other appropriate stakeholders — is crucial to 
prevent the momentum of bad actors and to preserve the quality of the scholarly literature. 


Publishers should continually review their vetting processes for editorial board members 
and any Guest Editors, alongside careful consideration of the responsibilities given to Guest 
Editors. 


Publishers should ensure that retraction notices are linked to the original article by updating 
the metadata of the article and made publicly discoverable as soon as possible to readers 
and machines. Ideally this should be done through centralized community-run metadata 
services such as Crossref so that downstream service providers can also ensure their 
platforms, services, and records are up to date (see also NISO, 2023). 


We need to establish best practices around when to add a notice or flag to a set of papers 
that carry hallmarks of a potential paper mill while an investigation is underway. This gives 
balance to our responsibility to investigate while also notifying the community that valid 
concerns have been raised. In the same spirit, we need to continue to commit resources to
efficient investigations and their outcomes.


Publishers, and third parties on whom publishers rely for data and identity verification, need 
to recognize the challenges inherent in this environment, including the rapid developments
with artificial intelligence, and upgrade their capabilities accordingly.


All stakeholders need to incentivize open research practices such that the focus is not only 
on the final published article; this would both bring increased transparency to the research 
process and act as a means of mitigating research integrity concerns. 


While Hindawi and Wiley have taken many concrete actions in both the short- and long-term, there 
is much to do. The issues covered in this white paper are multi-faceted and stretch across the entire 
scholarly publishing ecosystem. The hope is that in sharing these insights, a useful perspective and 
suggestions of action have been provided, regarding an issue threatening to impact the future 
health of our industry and, critically, society’s trust in scholarly research. 